samwillis
Collaborator

@samwillis samwillis commented Oct 2, 2025

stacked on #625

Live Query Scheduler Overview

Overview

This change introduces a scoped, dependency-aware scheduler that guarantees every live-query builder runs at most once per transaction, even when multiple source collections or derived queries fire in the same mutate call. The scheduler groups work by transaction id, dedupes entries by the CollectionConfigBuilder instance, tracks dependencies between builders, and flushes synchronously when Transaction.mutate exits. The net effect: optimistic updates from a single transaction coalesce into a single graph run for each live query, so downstream frameworks see one change batch per transaction.

How scheduling works

  • Each CollectionSubscriber immediately forwards changes to the D2 input and calls CollectionConfigBuilder.scheduleGraphRun.
  • scheduleGraphRun records which upstream builders the current builder depends on (based on the collections it subscribed to) and hands a job to the shared scheduler with:
    • contextId: the transaction id, grouping all work triggered by that transaction.
    • jobId: the builder instance; repeated schedules for the same builder merge into a single entry.
    • dependencies: the set of upstream builders that must finish before this builder can run.
  • The scheduler maintains, per transaction, a queue, entry map, dependency map, and “completed” set. When flushing, it only executes a job once all dependencies are marked completed, deferring the job otherwise. It loops until no work remains and throws if it detects an impossible cycle.
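To make the scheduling rules above concrete, here is a minimal sketch of a context-scoped, dependency-aware scheduler. It is not the code in this PR: `SketchScheduler`, `Entry`, and the method signatures are hypothetical names chosen for illustration. It also assumes that a dependency with no pending entry in the same context counts as satisfied.

```typescript
// Illustrative sketch only; all names here are hypothetical, not the PR's scheduler.ts.
type Job = () => void

interface Entry {
  run: Job
  dependencies: Set<unknown>
}

class SketchScheduler {
  // contextId -> (jobId -> entry). A Map keeps insertion order, so it doubles as the queue.
  private contexts = new Map<unknown, Map<unknown, Entry>>()

  schedule(contextId: unknown, jobId: unknown, dependencies: Iterable<unknown>, run: Job): void {
    let entries = this.contexts.get(contextId)
    if (!entries) {
      entries = new Map()
      this.contexts.set(contextId, entries)
    }
    const existing = entries.get(jobId)
    if (existing) {
      // Repeated schedules for the same jobId merge into the existing entry.
      for (const dep of dependencies) existing.dependencies.add(dep)
      existing.run = run
      return
    }
    entries.set(jobId, { run, dependencies: new Set(dependencies) })
  }

  flush(contextId: unknown): void {
    const entries = this.contexts.get(contextId)
    if (!entries) return
    const completed = new Set<unknown>()
    try {
      while (entries.size > 0) {
        let progressed = false
        for (const [jobId, entry] of Array.from(entries)) {
          // Assumption: only dependencies that are still pending in this context block a job.
          const blocked = Array.from(entry.dependencies).some(
            (dep) => dep !== jobId && entries.has(dep) && !completed.has(dep)
          )
          if (blocked) continue // deferred until a later pass
          entries.delete(jobId)
          entry.run()
          completed.add(jobId)
          progressed = true
        }
        // A full pass with no runnable job means the remaining jobs wait on each other.
        if (!progressed) throw new Error(`dependency cycle`)
      }
    } finally {
      this.contexts.delete(contextId)
    }
  }
}

// Usage: the child is scheduled twice and before its parent, yet runs once, after the parent.
const order: string[] = []
const scheduler = new SketchScheduler()
const parent = { name: `parent` }
const child = { name: `child` }
scheduler.schedule(`tx-1`, child, [parent], () => order.push(`child`))
scheduler.schedule(`tx-1`, parent, [], () => order.push(`parent`))
scheduler.schedule(`tx-1`, child, [parent], () => order.push(`child`))
scheduler.flush(`tx-1`)
console.log(order) // [ 'parent', 'child' ]
```

Each pass of the `while` loop rescans every pending entry, which is simple to follow but not tuned for large dependency graphs.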

When the graph actually runs

  • Inside Transaction.mutate, we call registerTransaction before the user’s callback and unregisterTransaction in a finally block afterward.
  • registerTransaction clears any stale entries from previous failed scopes.
  • unregisterTransaction calls scheduler.flush(tx.id). flush now loops while a context map exists, so if a job enqueues additional work during its own execution (e.g., the join scheduling itself after its parents run), the scheduler immediately picks up the new entry before leaving the transaction scope.
  • If there is no active transaction, scheduleGraphRun runs maybeRunGraph immediately (backward compatible behaviour).
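The lifecycle above can be sketched roughly as follows. The function names mirror the ones mentioned in this description, but the bodies are simplified stand-ins written for illustration: the `pending` map and `activeTransactions` stack are assumptions, and deduping and dependency ordering are left out.

```typescript
// Simplified stand-ins; not the PR's implementation.
const pending = new Map<string, Array<() => void>>()
const activeTransactions: string[] = []

function registerTransaction(txId: string): void {
  pending.delete(txId) // clear stale entries left by a previous failed scope
  activeTransactions.push(txId)
}

function unregisterTransaction(txId: string): void {
  // Loop while a context exists, so jobs enqueued during the flush are picked up too.
  let jobs = pending.get(txId)
  while (jobs !== undefined) {
    pending.delete(txId)
    for (const job of jobs) job()
    jobs = pending.get(txId)
  }
  const index = activeTransactions.lastIndexOf(txId)
  if (index !== -1) activeTransactions.splice(index, 1)
}

function scheduleGraphRun(job: () => void): void {
  const txId = activeTransactions[activeTransactions.length - 1]
  if (txId === undefined) {
    job() // no active transaction: run immediately (backward compatible behaviour)
    return
  }
  const jobs = pending.get(txId) ?? []
  jobs.push(job)
  pending.set(txId, jobs)
}

function mutate(txId: string, callback: () => void): void {
  registerTransaction(txId)
  try {
    callback()
  } finally {
    unregisterTransaction(txId) // flushes synchronously as the scope exits
  }
}

// Usage: work scheduled inside mutate is deferred until the callback returns.
const log: string[] = []
scheduleGraphRun(() => log.push(`immediate`))
mutate(`tx-1`, () => {
  scheduleGraphRun(() => log.push(`deferred`))
  log.push(`callback done`)
})
console.log(log) // [ 'immediate', 'callback done', 'deferred' ]
```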

Nested live queries (the “diamond” cases)

  • Builders register themselves when a live-query collection is created. Whenever a builder subscribes to another live query, it records that dependency.
  • During a transaction, updates enqueue jobs for the parent builders. When those builders finish, they mark themselves as completed, which in turn allows child jobs (such as joins) to run exactly once at the end of the flush.
  • A hybrid variant (join between liveQueryA and raw collectionB) behaves the same way: even if collectionB fires first, the join job is deferred until liveQueryA has finished.
  • Because dependencies are tracked explicitly, there are no deadlocks. If the scheduler ever loops without making progress, it throws a clear “dependency cycle” error.

Order-by loaders and batching

  • The maybeRunGraph loop keeps calling graph.run() while pendingWork() is true. If an order-by loader requests more rows, it pushes those changes into the same D2 input, triggering pendingWork() again.
  • During this loop CollectionConfigBuilder.isGraphRunning is true, so any nested call to scheduleGraphRun is ignored; the loader’s extra rows are consumed by the ongoing loop.
  • As a result, even hungry top-K queries emit a single batch per synchronous transaction run. Multiple batches only occur when a loader deliberately yields results asynchronously (for example, the incremental sync features we’re adding)—in that case the UI should expect staged updates.
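A rough sketch of the re-entrancy guard described above, under simplifying assumptions: `SketchBuilder` is a hypothetical class, the `queue` array stands in for the D2 input, and the "loader" just pulls one row at a time from a `source` array until `limit` rows have been seen.

```typescript
// Hypothetical stand-in for a builder with an order-by loader; not the PR's code.
class SketchBuilder {
  private isGraphRunning = false
  private queue: number[] = [] // stands in for the D2 input
  readonly batches: number[][] = []

  constructor(
    private readonly limit: number,
    private readonly source: number[]
  ) {}

  push(rows: number[]): void {
    this.queue.push(...rows)
    this.scheduleGraphRun()
  }

  scheduleGraphRun(): void {
    // A nested call during a run is ignored; the ongoing loop consumes the new rows.
    if (this.isGraphRunning) return
    this.maybeRunGraph()
  }

  private pendingWork(): boolean {
    return this.queue.length > 0
  }

  private maybeRunGraph(): void {
    this.isGraphRunning = true
    const batch: number[] = []
    try {
      while (this.pendingWork()) {
        batch.push(...this.queue.splice(0)) // stands in for graph.run()
        // Loader: if the top-K window is not full yet, request one more row.
        if (batch.length < this.limit && this.source.length > 0) {
          this.push([this.source.shift()!]) // re-enters scheduleGraphRun, which returns early
        }
      }
    } finally {
      this.isGraphRunning = false
    }
    if (batch.length > 0) this.batches.push(batch)
  }
}

// Usage: one pushed row triggers two loader requests, yet only one batch is emitted.
const builder = new SketchBuilder(3, [10, 20, 30])
builder.push([1])
console.log(builder.batches) // [ [ 1, 10, 20 ] ]
```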

Tests

scheduler.test.ts now covers:

  1. Basic single run per transaction (two collections mutated inside one mutate).
  2. Nested transactions.
  3. Rollback cleanup.
  4. Multiple subscribers receiving exactly one batch.
  5. Loader deduping.
  6. Diamond dependency (live query joining two parents) – verifies parents run first and the join runs once (tracked via getRunCount).
  7. Hybrid diamond (join of a live query with a base collection) – same single-run guarantee, even when the raw collection fires first.

Each of the diamond tests mutates once and then performs another transaction with updates, demonstrating that we still emit exactly one additional batch—no duplicates, no missed runs—and that getRunCount() only increments once per transaction.

Summary

  • Scheduler coalesces work by transaction + builder.
  • Transaction.mutate clears before and flushes after every scope, so everything runs synchronously in the optimistic phase.
  • flush loops until no jobs remain, ensuring child live queries (joins) run exactly once per transaction after their parents.
  • Comprehensive tests cover simple, nested, rollback, multi-subscriber, loader, diamond, and hybrid patterns; asynchronous loaders will still produce multiple batches by design.

changeset-bot bot commented Oct 2, 2025

🦋 Changeset detected

Latest commit: 4d5ff08

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 12 packages
Name Type
@tanstack/db Patch
@tanstack/angular-db Patch
@tanstack/electric-db-collection Patch
@tanstack/query-db-collection Patch
@tanstack/react-db Patch
@tanstack/rxdb-db-collection Patch
@tanstack/solid-db Patch
@tanstack/svelte-db Patch
@tanstack/trailbase-db-collection Patch
@tanstack/vue-db Patch
todos Patch
@tanstack/db-example-react-todo Patch

pkg-pr-new bot commented Oct 2, 2025

@tanstack/angular-db

npm i https://pkg.pr.new/@tanstack/angular-db@628

@tanstack/db

npm i https://pkg.pr.new/@tanstack/db@628

@tanstack/db-ivm

npm i https://pkg.pr.new/@tanstack/db-ivm@628

@tanstack/electric-db-collection

npm i https://pkg.pr.new/@tanstack/electric-db-collection@628

@tanstack/query-db-collection

npm i https://pkg.pr.new/@tanstack/query-db-collection@628

@tanstack/react-db

npm i https://pkg.pr.new/@tanstack/react-db@628

@tanstack/rxdb-db-collection

npm i https://pkg.pr.new/@tanstack/rxdb-db-collection@628

@tanstack/solid-db

npm i https://pkg.pr.new/@tanstack/solid-db@628

@tanstack/svelte-db

npm i https://pkg.pr.new/@tanstack/svelte-db@628

@tanstack/trailbase-db-collection

npm i https://pkg.pr.new/@tanstack/trailbase-db-collection@628

@tanstack/vue-db

npm i https://pkg.pr.new/@tanstack/vue-db@628

commit: 4d5ff08

Contributor

github-actions bot commented Oct 2, 2025

Size Change: +2.97 kB (+3.95%)

Total Size: 78.2 kB

Filename Size Change
./packages/db/dist/esm/query/live-query-collection.js 416 B +76 B (+22.35%) 🚨
./packages/db/dist/esm/query/live/collection-config-builder.js 4.43 kB +1.18 kB (+36.44%) 🚨
./packages/db/dist/esm/query/live/collection-subscriber.js 1.83 kB +77 B (+4.39%)
./packages/db/dist/esm/transactions.js 3.08 kB +44 B (+1.45%)
./packages/db/dist/esm/query/live/collection-registry.js 349 B +349 B (new file) 🆕
./packages/db/dist/esm/scheduler.js 1.24 kB +1.24 kB (new file) 🆕
ℹ️ View Unchanged
Filename Size
./packages/db/dist/esm/collection/change-events.js 958 B
./packages/db/dist/esm/collection/changes.js 1.01 kB
./packages/db/dist/esm/collection/events.js 683 B
./packages/db/dist/esm/collection/index.js 3.14 kB
./packages/db/dist/esm/collection/indexes.js 1.16 kB
./packages/db/dist/esm/collection/lifecycle.js 1.8 kB
./packages/db/dist/esm/collection/mutations.js 2.59 kB
./packages/db/dist/esm/collection/state.js 3.81 kB
./packages/db/dist/esm/collection/subscription.js 1.69 kB
./packages/db/dist/esm/collection/sync.js 1.32 kB
./packages/db/dist/esm/deferred.js 230 B
./packages/db/dist/esm/errors.js 3.46 kB
./packages/db/dist/esm/index.js 1.6 kB
./packages/db/dist/esm/indexes/auto-index.js 745 B
./packages/db/dist/esm/indexes/base-index.js 605 B
./packages/db/dist/esm/indexes/btree-index.js 1.82 kB
./packages/db/dist/esm/indexes/lazy-index.js 1.25 kB
./packages/db/dist/esm/local-only.js 827 B
./packages/db/dist/esm/local-storage.js 2.02 kB
./packages/db/dist/esm/optimistic-action.js 294 B
./packages/db/dist/esm/proxy.js 3.87 kB
./packages/db/dist/esm/query/builder/functions.js 615 B
./packages/db/dist/esm/query/builder/index.js 3.93 kB
./packages/db/dist/esm/query/builder/ref-proxy.js 938 B
./packages/db/dist/esm/query/compiler/evaluators.js 1.56 kB
./packages/db/dist/esm/query/compiler/expressions.js 631 B
./packages/db/dist/esm/query/compiler/group-by.js 2.11 kB
./packages/db/dist/esm/query/compiler/index.js 2.19 kB
./packages/db/dist/esm/query/compiler/joins.js 2.67 kB
./packages/db/dist/esm/query/compiler/order-by.js 1.27 kB
./packages/db/dist/esm/query/compiler/select.js 1.28 kB
./packages/db/dist/esm/query/ir.js 785 B
./packages/db/dist/esm/query/optimizer.js 3.1 kB
./packages/db/dist/esm/SortedMap.js 1.24 kB
./packages/db/dist/esm/utils.js 943 B
./packages/db/dist/esm/utils/browser-polyfills.js 365 B
./packages/db/dist/esm/utils/btree.js 6.02 kB
./packages/db/dist/esm/utils/comparison.js 754 B
./packages/db/dist/esm/utils/index-optimization.js 1.62 kB

compressed-size-action::db-package-size

Contributor

github-actions bot commented Oct 2, 2025

Size Change: 0 B

Total Size: 1.44 kB

ℹ️ View Unchanged
Filename Size
./packages/react-db/dist/esm/index.js 152 B
./packages/react-db/dist/esm/useLiveQuery.js 1.29 kB

compressed-size-action::react-db-package-size

@samwillis samwillis marked this pull request as draft October 2, 2025 14:20
@samwillis samwillis marked this pull request as ready for review October 4, 2025 11:36
@samwillis samwillis marked this pull request as draft October 4, 2025 11:59
@samwillis samwillis marked this pull request as ready for review October 4, 2025 14:55
@samwillis samwillis requested review from kevin-dp and KyleAMathews and removed request for kevin-dp October 4, 2025 14:55
} from "./types.js"

export type RunCountUtils = UtilsRecord & {
getRunCount: () => number
Collaborator Author

I have added a utils.getRunCount() so the tests can check this easily. We should extend it with more run metadata and timing, which will be useful for dev tools.

Contributor

@kevin-dp kevin-dp left a comment

I had a close look at this PR and left some comments. My main concern is the time complexity of the scheduler's flush method.

"@tanstack/db": patch
---

Add a scheduler that ensures that if a transaction touches multiple collections that feed into a single live query, the liver query only emits a single batch of updates. This fixes an issue where multiple renders could be triggered from a live query under this situation.
Contributor

typo: the liver query

const collection = bridgeToCreateCollection(options)

if (config.utils) {
Object.assign(collection.utils, config.utils)
Contributor

This is weird. Why do we need to mutate it after the fact instead of passing it as an option when calling bridgeToCreateCollection, as we did before?

): Collection<TResult, string | number, TUtils> {
// This is the only place we need a type assertion, hidden from user API
return createCollection(options as any) as unknown as Collection<
function bridgeToCreateCollection<TResult extends object>(
Contributor

Why remove the TUtils type parameter? Because of that, the return type is now less precise and we have to explicitly cast it on L140.

private readonly aliasDependencies: Record<
string,
Array<CollectionConfigBuilder<any, any>>
> = Object.create(null)
Contributor

Why do we need Object.create(null)? Can't we just use an object literal {}?
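
For context on this question, a small sketch of how the two differ. This illustrates standard JavaScript behaviour only and does not claim to be the author's reason for the choice: an object literal inherits from Object.prototype, so string keys such as `toString` or `constructor` appear to exist, while an object created with Object.create(null) has no prototype at all.

```typescript
// Plain JavaScript semantics; the variable names are illustrative.
const literal: Record<string, unknown> = {}
const bare: Record<string, unknown> = Object.create(null)

console.log(`toString` in literal) // true, inherited from Object.prototype
console.log(`toString` in bare) // false, there is no prototype to inherit from

console.log(literal[`constructor`] !== undefined) // true
console.log(bare[`constructor`] !== undefined) // false
```

Whether that matters here depends on whether an alias could ever collide with an inherited property name.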

}
} catch (error) {
allDone = false
if (firstError === undefined) {
Contributor

Can simplify to firstError ??= error
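
A small sketch of what that simplification does, using hypothetical helper functions rather than the PR's code. One nuance, assuming the thrown value could ever be `null`: `??=` assigns when the left side is `null` or `undefined`, whereas the explicit `=== undefined` check treats `null` as already set.

```typescript
// Hypothetical helpers for comparison; not taken from the PR.
function firstErrorNullish(errors: unknown[]): unknown {
  let firstError: unknown
  for (const error of errors) firstError ??= error
  return firstError
}

function firstErrorExplicit(errors: unknown[]): unknown {
  let firstError: unknown
  for (const error of errors) {
    if (firstError === undefined) firstError = error
  }
  return firstError
}

console.log(firstErrorNullish([`a`, `b`])) // a
console.log(firstErrorExplicit([`a`, `b`])) // a

// The two only diverge when a thrown value is null:
console.log(firstErrorNullish([null, `b`])) // b
console.log(firstErrorExplicit([null, `b`])) // null
```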

import type { CollectionConfigBuilder } from "./collection-config-builder.js"
import type { CollectionSubscription } from "../../collection/subscription.js"

const loadMoreCallbackSymbol = Symbol(`tanstack.db.loadMore`)
Contributor

nit: I propose we change this one to @tanstack/db.loadMore to be consistent with the naming we used in collection-registry.ts:

const BUILDER_SYMBOL = Symbol.for(`@tanstack/db.collection-config-builder`)


// Do not provide the callback that loads more data
// if there's no more data to load
// otherwise we end up in an infinite loop trying to load more data
Contributor

This comment is useful to keep


// We need to call `maybeRunGraph` even if there's no data to load
// because we need to mark the collection as ready if it's not already
// and that's only done in `maybeRunGraph`
Contributor

This comment is also useful to keep but may need to be changed slightly (at least the references to maybeRunGraph need to be changed to scheduleGraphRun)

const subscriptionWithLoader = subscription as SubscriptionWithLoader

const boundLoader =
subscriptionWithLoader[loadMoreCallbackSymbol] ??
Contributor

If I'm not mistaken, the trick we do with ?? here is exactly what ??= does, so we can rewrite this as:

subscriptionWithLoader[loadMoreCallbackSymbol] ??=
  this.loadMoreIfNeeded.bind(this, subscription)

this.loadMoreIfNeeded.bind(this, subscription)
)

// Cache the bound loader on the subscription using a symbol property.
Contributor

"bound loader" is quite cryptic.
